-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-44718][SQL] Match ColumnVector memory-mode config default to OffHeapMemoryMode config value #42394
Conversation
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM cc @cloud-fan @Ngone51 @jiangxb1987
The failing CIs seems to be unrelated test failures. |
import org.apache.spark.sql.internal.SQLConf | ||
import org.apache.spark.sql.test.SharedSparkSession | ||
|
||
class ConfigColumnVectorModeDefaultSuite extends SharedSparkSession { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's probably too much to add a new test suite for it. We have tests for the config framework and .fallbackConf
is already tested by ConfigEntrySuite
. Shall we just remove it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed :)
The failure is unrelated, I'm merging it to master, thanks! |
…ffHeapMemoryMode config value ### What changes were proposed in this pull request? Set the column vector default memory mode to depend on the off-heap memory mode flag. This is to prevent a user from using Vectorized-Reader with an on-heap column-vector by default when the off-heap memory mode is enabled on the cluster. ### Why are the changes needed? Avoid the unintentional usage of on-heap memory in vectorized-reader when off-heap memory mode is enabled by the user. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manual & existing tests. Closes apache#42394 from majdyz/offheap-colvec-mode-default-value. Lead-authored-by: Zamil Majdy <[email protected]> Co-authored-by: Zamil Majdy <[email protected]> Signed-off-by: Wenchen Fan <[email protected]>
…ffHeapMemoryMode config value ### What changes were proposed in this pull request? Set the column vector default memory mode to depend on the off-heap memory mode flag. This is to prevent a user from using Vectorized-Reader with an on-heap column-vector by default when the off-heap memory mode is enabled on the cluster. ### Why are the changes needed? Avoid the unintentional usage of on-heap memory in vectorized-reader when off-heap memory mode is enabled by the user. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manual & existing tests. Closes apache#42394 from majdyz/offheap-colvec-mode-default-value. Lead-authored-by: Zamil Majdy <[email protected]> Co-authored-by: Zamil Majdy <[email protected]> Signed-off-by: Wenchen Fan <[email protected]>
…ffHeapMemoryMode config value ### What changes were proposed in this pull request? Set the column vector default memory mode to depend on the off-heap memory mode flag. This is to prevent a user from using Vectorized-Reader with an on-heap column-vector by default when the off-heap memory mode is enabled on the cluster. ### Why are the changes needed? Avoid the unintentional usage of on-heap memory in vectorized-reader when off-heap memory mode is enabled by the user. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manual & existing tests. Closes apache#42394 from majdyz/offheap-colvec-mode-default-value. Lead-authored-by: Zamil Majdy <[email protected]> Co-authored-by: Zamil Majdy <[email protected]> Signed-off-by: Wenchen Fan <[email protected]> (cherry picked from commit afcccb4)
…lumnVector.offheap.enabled's doc field ### What changes were proposed in this pull request? Followup of #42394 ``` * - spark.sql.columnVector.offheap.enabled - When true, use OffHeapColumnVector in ColumnarBatch. Defaults to ConfigEntry(key=spark.memory.offHeap.enabled, defaultValue=false, doc=If true, Spark will attempt to use off-heap memory for certain operations. If off-heap memory use is enabled, then spark.memory.offHeap.size must be positive., public=true, version=1.6.0). - <value of spark.memory.offHeap.enabled> - 2.3.0 ``` The doc field shall be interpolated by MEMORY_OFFHEAP_ENABLED.key instead of MEMORY_OFFHEAP_ENABLED. In this PR, we remove the doc redundant doc as it's also can be found in the `MEMORY_OFFHEAP_ENABLED.defaultValueString` ### Why are the changes needed? docfix ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? manually debugging ### Was this patch authored or co-authored using generative AI tooling? no Closes #47165 from yaooqinn/minor2. Authored-by: Kent Yao <[email protected]> Signed-off-by: Kent Yao <[email protected]>
…lumnVector.offheap.enabled's doc field ### What changes were proposed in this pull request? Followup of apache#42394 ``` * - spark.sql.columnVector.offheap.enabled - When true, use OffHeapColumnVector in ColumnarBatch. Defaults to ConfigEntry(key=spark.memory.offHeap.enabled, defaultValue=false, doc=If true, Spark will attempt to use off-heap memory for certain operations. If off-heap memory use is enabled, then spark.memory.offHeap.size must be positive., public=true, version=1.6.0). - <value of spark.memory.offHeap.enabled> - 2.3.0 ``` The doc field shall be interpolated by MEMORY_OFFHEAP_ENABLED.key instead of MEMORY_OFFHEAP_ENABLED. In this PR, we remove the doc redundant doc as it's also can be found in the `MEMORY_OFFHEAP_ENABLED.defaultValueString` ### Why are the changes needed? docfix ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? manually debugging ### Was this patch authored or co-authored using generative AI tooling? no Closes apache#47165 from yaooqinn/minor2. Authored-by: Kent Yao <[email protected]> Signed-off-by: Kent Yao <[email protected]>
What changes were proposed in this pull request?
Set the column vector default memory mode to depend on the off-heap memory mode flag. This is to prevent a user from using Vectorized-Reader with an on-heap column-vector by default when the off-heap memory mode is enabled on the cluster.
Why are the changes needed?
Avoid the unintentional usage of on-heap memory in vectorized-reader when off-heap memory mode is enabled by the user.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Manual & existing tests.